Crime and Prosperity#
Student names: Chris Dukker, David Snoeks, Ryan Rodrigus, Jason Roes
Team number: D2
# Load image from link
url = 'https://ccv-secondant.nl/fileadmin/w/secondant_nl/platform/illustraties/Internationaal.jpg'
# Display image from URL with smaller size and subtitle
from IPython.display import Image, display
# Set the desired image width and height
width = 600
height = 300
# Set the subtitle text
subtitle = "© CCV / Hans Sprangers"
# Create an Image instance with the URL
image = Image(url=url, width=width, height=height)
# Display the image and subtitle
display(image)
print(subtitle)

© CCV / Hans Sprangers
Introduction#
Europe is, broadly speaking, prosperous. Several of the world's strongest economies are on this continent, and in economic terms the EU can measure itself against other great powers. This prosperity is not evenly distributed, however. Some countries are wealthier than others, and within countries there is economic inequality because wealth is distributed unevenly. Crime differs as well: countries face different amounts of crime, and not every country suffers from the same types of crime. The question this data story addresses is: is there a relationship between a country's prosperity (and how it is distributed) and the amount of illegal activity in that country?
In this data story we investigate whether wealth inequality and average income per person affect the number of crimes committed, and whether these variables correlate more strongly with particular categories of crime such as murder, rape, theft and fraud. We do this using World Bank Group data on the GINI coefficient (a measure of income or wealth inequality), the economic statistics and growth dataset from World Bank Open Data, and the crime statistics dataset from Eurostat.
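To make the GINI coefficient mentioned above concrete, here is a minimal sketch of how it can be computed from a list of individual incomes. The function name `gini_coefficient` and the toy income lists are illustrative assumptions, not part of our datasets; the World Bank publishes the same measure as an index scaled from 0 to 100 rather than from 0 to 1.

```python
def gini_coefficient(incomes):
    """Return the Gini coefficient (0 = perfect equality) of non-negative incomes."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Formula for sorted data: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted_sum = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted_sum / (n * total) - (n + 1) / n


# Hypothetical toy populations, not real data
equal = gini_coefficient([10, 10, 10, 10])    # everyone earns the same
unequal = gini_coefficient([0, 0, 0, 40])     # one person earns everything
print(equal, unequal)
```

With everyone earning the same amount the coefficient is 0, and it approaches 1 as income concentrates in fewer hands.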
According to De Courson & Nettle (2021), crime is the best way for people with low income and little capital to improve their quality of life. Because their future prospects are limited, the potential gain from a successful crime is worth the risk of being caught. According to the same study, high inequality leads to more crime, whereas a fairer distribution of wealth has positive effects and reduces the potential benefits of crime.
More crime due to wealth inequality#
Crime and prosperity (inequality) are strongly related, as the arguments below show.
Relationship between theft, fraud and inequality#
Correlation 1 with graph
At solmen va esser necessi far uniform grammatica, pronunciation e plu sommun paroles. Ma quande lingues coalesce, li grammatica del resultant lingue es plu simplic e regulari quam ti del coalescent lingues. Li nov lingua franca va esser plu simplic e regulari quam li existent Europan lingues. Nam eget dui. Etiam rhoncus. Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet adipiscing sem neque sed ipsum. Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem. Maecenas nec odio et ante tincidunt tempus. Donec vitae sapien ut libero venenatis faucibus. Nullam quis ante.
Figure 2: Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
Nam eget dui. Etiam rhoncus. Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet adipiscing sem neque sed ipsum. Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem. Maecenas nec odio et ante tincidunt tempus. Donec vitae sapien ut libero venenatis faucibus. Nullam quis ante.
The Second Argument of Your First Perspective#
It va esser tam simplic quam Occidental in fact, it va esser Occidental. A un Angleso it va semblar un simplificat Angles, quam un skeptic Cambridge amico dit me que Occidental es. Li Europan lingues es membres del sam familie. Lor separat existentie es un myth. Por scientie, musica, sport etc, litot Europa usa li sam vocabular. Li lingues differe solmen in li grammatica, li pronunciation e li plu commun vocabules. Omnicos directe al desirabilite de un nov lingua franca: On refusa continuar payar custosi traductores. At solmen va esser necessi far uniform grammatica, pronunciation e plu sommun paroles.
import plotly.graph_objects as go
import pandas as pd
Figure 3: Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
Li Europan lingues es membres del sam familie. Lor separat existentie es un myth. Por scientie, musica, sport etc, litot Europa usa li sam vocabular. Li lingues differe solmen in li grammatica, li pronunciation e li plu commun vocabules. Omnicos directe al desirabilite de un nov lingua franca: On refusa continuar payar custosi traductores.
Your Second Perspective#
Crime and prosperity (inequality) are not strongly related, or other important factors are at play, as the arguments below show.
The First Argument of Your Second Perspective#
Although you might expect poorer countries to have more crime, especially theft, which accounts for the largest share of reported crimes, this is not the case. When we analyzed our dataset, we found that, in general, the richer a country is, the higher its total number of reported crimes.
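One way to put a number on the strength of such a relationship is the Pearson correlation coefficient. The sketch below uses a small hypothetical table; the column names `gdp_per_capita` and `total_crime_per_100k` and all values are made up for illustration and are not taken from the World Bank or Eurostat data.

```python
import pandas as pd

# Hypothetical toy values, NOT taken from our datasets
toy_df = pd.DataFrame({
    "gdp_per_capita": [20000, 30000, 40000, 50000, 60000],
    "total_crime_per_100k": [1500, 2100, 2600, 3400, 3900],
})

# Pearson correlation: +1 is a perfect increasing linear relation, 0 is none
pearson_r = toy_df["gdp_per_capita"].corr(toy_df["total_crime_per_100k"])
print(f"Pearson r = {pearson_r:.3f}")
```

Applied to a merged table of GDP per capita and total crime rate per country and year, the same call would give a single summary number to read alongside the scatter plot and its trendline.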
import pandas as pd
import plotly.express as px
import statsmodels
# Load and process data (same as before)
bank1_df = pd.read_csv("world_bank_definitive.csv")
crime_df = pd.read_csv("europe_crime_definitive_per_100k.csv")
bank_df = bank1_df[bank1_df['Indicator Name'] == "GDP per capita, PPP (constant 2021 international $)"]
bank_df = bank_df.rename(columns={"Value": "GDP per capita, PPP (constant 2021 international $)"})
# Sum only the crime columns; coerce to numeric first, because columns read as
# strings make DataFrame.sum raise a TypeError
crime_columns = [col for col in crime_df.columns if col not in ["Country Name", "Year"]]
crime_df["Total Crime Rate per 100k"] = (
    crime_df[crime_columns].apply(pd.to_numeric, errors="coerce").sum(axis=1)
)
merged_df = pd.merge(
    crime_df[["Country Name", "Year", "Total Crime Rate per 100k"]],
    bank_df[["Country Name", "Year", "GDP per capita, PPP (constant 2021 international $)"]],
    on=["Country Name", "Year"]
)
# Create scatter plot with trendline
fig = px.scatter(
    merged_df,
    x="GDP per capita, PPP (constant 2021 international $)",
    y="Total Crime Rate per 100k",
    hover_name="Country Name",
    hover_data={"Year": True},
    trendline="ols",  # Ordinary Least Squares regression line
    title="GDP per Capita vs. Total Crime Rate per 100k with Trendline"
)
fig.update_layout(
    xaxis_title="GDP per capita, PPP (constant 2021 international $)",
    yaxis_title="Total Crime Rate per 100k"
)
fig.show()
Figure 4: Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur?
The Second Argument of Your Second Perspective#
Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt.
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
import pycountry
pio.renderers.default = 'notebook'
# === Load GINI Data ===
gini_df = pd.read_csv("gini_definitive.csv")
gini_df['Year'] = gini_df['Year'].astype(int)
# === Load Theft Data ===
theft_df = pd.read_csv("europe_crime_definitive_absolute.csv")
theft_df.rename(columns={'geo': 'Country Code', 'TIME_PERIOD': 'Year'}, inplace=True)
# Convert ISO-2 to ISO-3
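def convert_iso2_to_iso3(code):
    """Return the ISO 3166-1 alpha-3 code for an alpha-2 code, or None if unknown."""
    try:
        return pycountry.countries.get(alpha_2=code).alpha_3
    except (AttributeError, LookupError):
        # No matching country; handled by the manual fixes below
        return None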
theft_df['Country Code'] = theft_df['Country Code'].apply(convert_iso2_to_iso3)
# Manual fixes for special regions/countries
manual_fix = {
    'England and Wales': 'GBR',
    'Northern Ireland (UK) (NUTS 2021)': 'GBR',
    'Scotland (NUTS 2021)': 'GBR',
    'Greece': 'GRC',
    'Kosovo*': 'XKX'
}
theft_df['Country Code'] = theft_df.apply(
    lambda row: manual_fix[row['Geopolitical entity (reporting)']]
    if pd.isnull(row['Country Code']) and row['Geopolitical entity (reporting)'] in manual_fix
    else row['Country Code'],
    axis=1
)
theft_df['Year'] = theft_df['Year'].astype(int)
theft_df['Theft'] = pd.to_numeric(theft_df['Theft'], errors='coerce').fillna(0)
# === Years intersection and max year 2022 ===
years = sorted(list(set(gini_df['Year']).intersection(set(theft_df['Year']))))
years = [year for year in years if year <= 2022]
# Create subplot
fig = make_subplots(
    rows=1, cols=2,
    specs=[[{'type': 'choropleth'}, {'type': 'choropleth'}]],
    subplot_titles=('GINI Index', 'Theft Incidents')
)
# Define color scales
gini_min, gini_max = gini_df['Value'].min(), gini_df['Value'].max()
theft_min, theft_max = theft_df['Theft'].min(), theft_df['Theft'].max()
# Add base traces (Year = first year)
fig.add_trace(
    go.Choropleth(
        locations=gini_df[gini_df['Year'] == years[0]]['Country Code'],
        z=gini_df[gini_df['Year'] == years[0]]['Value'],
        text=gini_df[gini_df['Year'] == years[0]]['Country Name'],
        colorscale='Viridis',
        zmin=gini_min,
        zmax=gini_max,
        colorbar=dict(title='GINI', x=0.45)  # position colorbar left
    ),
    row=1, col=1
)
fig.add_trace(
    go.Choropleth(
        locations=theft_df[theft_df['Year'] == years[0]]['Country Code'],
        z=theft_df[theft_df['Year'] == years[0]]['Theft'],
        text=theft_df[theft_df['Year'] == years[0]]['Geopolitical entity (reporting)'],
        colorscale='Reds',
        zmin=theft_min,
        zmax=theft_max,
        colorbar=dict(title='Theft', x=1.0)  # position colorbar right
    ),
    row=1, col=2
)
# Animation frames
frames = []
for year in years:
    frame = go.Frame(
        data=[
            go.Choropleth(
                locations=gini_df[gini_df['Year'] == year]['Country Code'],
                z=gini_df[gini_df['Year'] == year]['Value'],
                text=gini_df[gini_df['Year'] == year]['Country Name']
            ),
            go.Choropleth(
                locations=theft_df[theft_df['Year'] == year]['Country Code'],
                z=theft_df[theft_df['Year'] == year]['Theft'],
                text=theft_df[theft_df['Year'] == year]['Geopolitical entity (reporting)']
            )
        ],
        name=str(year)
    )
    frames.append(frame)
# Update layout
fig.update_layout(
    title_text='GINI Index and Theft Incidents in Europe per Year',
    title_x=0.5,
    geo=dict(
        showframe=False,
        showcoastlines=True,
        lataxis_range=[30, 72],
        lonaxis_range=[-25, 45],
        projection_type='natural earth'
    ),
    geo2=dict(  # for the 2nd map
        showframe=False,
        showcoastlines=True,
        lataxis_range=[30, 72],
        lonaxis_range=[-25, 45],
        projection_type='natural earth'
    ),
    sliders=[{
        "steps": [{
            "args": [[str(year)], {"frame": {"duration": 500, "redraw": True}, "mode": "immediate"}],
            "label": str(year),
            "method": "animate"
        } for year in years],
        "transition": {"duration": 300},
        "x": 0.1,
        "len": 0.8
    }],
    updatemenus=[{
        "buttons": [{
            "args": [None, {"frame": {"duration": 500, "redraw": True}, "fromcurrent": True}],
            "label": "Play",
            "method": "animate"
        }, {
            "args": [[None], {"frame": {"duration": 0}, "mode": "immediate"}],
            "label": "Pause",
            "method": "animate"
        }],
        "direction": "left",
        "pad": {"r": 10, "t": 70},
        "showactive": False,
        "type": "buttons",
        "x": 0.1,
        "xanchor": "right",
        "y": 0,
        "yanchor": "top"
    }]
)
fig.frames = frames
fig.show()
Figure 5: Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
import pandas as pd
import plotly.express as px
import random
Figure 6: Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus.
Reflection#
Curabitur non lacus ex. Maecenas at massa ultricies justo venenatis condimentum sed et eros. Ut vitae iaculis massa. Aenean vitae sagittis nibh. Aliquam pharetra dui suscipit purus dictum rutrum. Donec ultricies odio quis porttitor aliquet. Fusce sed nisl non velit rutrum commodo nec sed magna. Morbi non volutpat mi, cursus pulvinar dolor.
Nam sit amet volutpat sapien. Aenean eu mattis neque. Maecenas eget libero consequat, condimentum nulla luctus, fermentum lectus. Donec at enim sit amet dolor vestibulum faucibus. Vestibulum velit elit, faucibus ut mi sit amet, mollis rutrum eros. Ut ut lacinia ante, eu placerat ligula. Fusce quis convallis purus. Maecenas eget fringilla quam.
Proin ac sapien et lectus tempor dignissim a at arcu. Donec placerat aliquet odio, vel aliquam nibh tempus vel. Pellentesque non velit iaculis, porta metus sed, dictum augue. Aenean tempus gravida ullamcorper. Proin cursus fringilla turpis. Integer id lectus dignissim, ultrices metus vel, dictum quam. Suspendisse augue ligula, vestibulum ac nulla a, porta pharetra leo. Integer et pharetra lacus, in porttitor mauris. Cras sodales metus sit amet enim rhoncus sodales. Etiam orci enim, tincidunt eget arcu vel, gravida scelerisque lacus.
Work Distribution#
Jason focused on preprocessing the datasets. After that, he focused mainly on coordinating the collaboration and on the accompanying text for the data story.
References#
De Courson, B., Nettle, D. Why do inequality and deprivation produce high crime and low trust? Sci Rep 11, 1937 (2021). https://doi.org/10.1038/s41598-020-80897-8